ABSTRACT

A  technique  is  presented  to  cluster  geospatial  features  on  an  electronic  display  and  determine  a  meaningful  measure 
of  display  clutter.  An  algorithm  previously  developed  by  the  Naval  Research  Laboratory  (NRL)  to  cluster  objects  in 
sidescan  imagery  has  been  modified  to  cluster  any  displayed  features  in  three  dimensions:  geospatial  location  (x,  y) 
and  color  (z).  This  paper  presents  preliminary  results  of  the  clustering  algorithm  and  cluster  density  calculations  for 
a  series  of  electronic  displays  with  varying  levels  of  clutter.  The  clutter  metric  correlates  with  preliminary, 
subjective  clutter  rankings.  Our  next  step  in  validating  this  method  will  be  correlating  the  metric  with  user 
performance. 


INTRODUCTION 


Objective 

The  objectives  of  this  project  are  to  develop  a  reliable  technique  for  measuring  clutter  in  navigation  displays  (e.g., 
electronic  charts)  and  to  link  this  clutter  metric  to  the  performance  of  a  person  using  the  display.  The  metric  then 
could  be  used  in  the  evaluation  of  new  displays  to  determine  the  optimum  amount  of  information  that  should  be 
displayed,  based  on  user  requirements. 

Background 

The Navy is implementing electronic charts throughout the fleet and has already installed moving-map displays in
many of its aircraft, including the F/A-18 and AV-8B. As new sources of information become available for display,
and as new display techniques are developed, there is a tendency to display everything that might be of interest
to the user. These new displays introduce potential human factors issues regarding the user's ability to access
and interpret the displayed information. Many studies have linked display complexity to user performance,
especially in terms of a pilot's ability to use the displayed information (e.g., Aretz, 1988; Schons and Wickens,
1993; Wickens and Carswell, 1995). The last two reports found that display clutter can disrupt a pilot's visual
attention, resulting in greater uncertainty concerning target locations. When a moving map scrolls at a high rate
of speed, as in the case of a large-scale cockpit display in a fighter jet, the chart's effectiveness can decrease
substantially. As nautical electronic charts become more commonplace in the Navy, similar problems can be expected
to arise for this user community as well.

While researchers have demonstrated a link between user performance and the presence of so-called “clutter”
(which can include both the overcrowding of otherwise important information and unwanted data or noise),
we still lack a reliable method of automatically quantifying display clutter in a way that can be empirically
tied to performance.


APPROACH 


Hypothesis 

We theorize that our perception of clutter is primarily related to saliency and color uniformity. Saliency refers
to how clearly one color or feature “pops out” from the surrounding features in an image, which we estimate by a
weighted average of color gradients between adjacent features. Color uniformity refers to how closely packed
similarly colored pixels are within the image. To estimate this value, we have adapted a clustering algorithm that
we originally developed to cluster seafloor objects detected in sidescan sonar imagery. The algorithm clusters
features detected within a predetermined geospatial distance from each other, produces vertices for a bounding
cluster polygon, and calculates the cluster’s density as the number of clustered features divided by the area of
the polygon. For this project, we adapted the clustering algorithm to operate in 3D space, where the third
dimension is color. Our “color uniformity” value is then derived from the density of similarly colored pixels
within a 3D cluster (e.g., density = a weighted number of points within the cluster divided by the cluster’s
volume).

We describe image clutter in terms of both local and global clutter components. The Local Clutter Metric (LCM)
described in this paper represents the contribution of one color or feature to the overall image clutter, and
equals 1 minus the weighted average (by area) of the densities of all clusters centered on that color or feature.
The Global Clutter Metric (GCM) represents the overall image clutter, and equals the weighted average of the LCMs
for all colors or features in the image.

Figure  1  illustrates  our  proposed  clutter  function,  in  terms  of  saliency  and  LCM/GCM.  The  following  sections 
describe  in  more  detail  how  each  of  these  metrics  is  calculated. 


[Figure 1 diagram: four regions, with salience increasing from low to high along the horizontal axis.]

Low salience, high LCM/GCM: moderate-to-high clutter. Dithering or low-contrast speckling (usually in raster images).
High salience, high LCM/GCM: highest clutter. Many small, distracting features (e.g., crowded text, point features).
Low salience, low LCM/GCM: lowest clutter. Large, gradually shaded area features (e.g., shaded contours in a bathy image).
High salience, low LCM/GCM: low-to-moderate clutter. Distinct area features (e.g., boundary between tan land and blue water).

Figure 1. Clutter as a function of saliency and LCM (for local clutter) or GCM (global clutter).


3D  Clustering  using  Geospatial  Bitmaps  (GB) 

The  original  clustering  algorithm  relies  on  a  geospatial  bitmapping  (GB)  technique  patented  by  NRL  in  2001 
(Gendron et al., 2001). The clustering algorithm itself was disclosed to the NRL Legal Office as another potential
patent  in  June  2003.  The  algorithm  is  unique  in  that  it  is  an  autonomous,  computationally  efficient  “single-pass” 
method  and  operates  on  a  user-defined  area  of  interest.  The  algorithm  performs  the  following  tasks:  1)  cluster 
features  by  2D  geospatial  location;  2)  smooth  and  simplify  the  resultant  cluster  boundaries  (optional);  and 
3)  calculate  a  numerical  measure  of  “cluster  density,”  which  considers  the  number  and  size  of  objects  clustered  in  a 
given  area,  as  well  as  the  scale  or  resolution  of  the  complete  dataset.  An  enhancement  to  the  original  algorithm  for 
this  project  is  the  ability  to  cluster  features  in  three  or  more  dimensions:  two  geospatial  (x,  y)  dimensions  plus  a  third 
(z)  dimension  such  as  color,  size,  or  feature  type.  This  paper  presents  preliminary  results  of  clustering  by  geospatial 
location  and  color. 


The clustering algorithm discussed in this paper is a non-hierarchical algorithm similar to, but more efficient
than, Nearest Neighbor (NN), which iteratively calculates and compares the distances between every pair of elements
in the dataset to determine which elements should be clustered together. The GB algorithm is non-iterative, faster,
and less computationally intensive than NN, and it requires less computer memory. The authors suggest that the GB
algorithm is well-suited to autonomous clustering applications, because the ordering of elements input to the GB
algorithm has no effect on the resulting clusters (unlike NN and other single-pass methods), and the GB algorithm
does not require a seed point to initiate clustering (unlike K-means and other relocation methods).

The GB algorithm uses simple bitmaps with a depth of one bit per pixel. These are binary structures (e.g., binary
images) in which bits are turned on (set = 1) or off (cleared = 0). The index of each bit is unique and denotes its
position relative to the other bits in the bitmap. In a 2D bitmap, each bit is indexed by its column (x) and row
(y); in 3D, each bit is indexed by x, y, and depth (z). A set bit indicates that an element of interest exists at
that location, accurate to within the resolution of the bitmap. A cleared bit indicates the absence of any element
at that location. Although a GB can be defined for an entire finite space, memory is allocated dynamically, and
only when groups of spatially close bits are set. The result is a compact data structure that supports very fast
Boolean and morphological operations.
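
The idea of allocating memory only where bits are set can be illustrated with a small sketch. This is not the patented GB implementation; it is a hypothetical tiled bitmap in Python, in which the class name `SparseBitmap3D`, the tile size, and the use of Python integers as bit fields are all assumptions made for illustration.

```python
class SparseBitmap3D:
    """Hypothetical sparse 3D bitmap (illustration only, not the NRL GB code).

    The (x, y, z) space is divided into cubic tiles. A tile is stored only
    once at least one of its bits has been set, so empty regions cost nothing.
    """

    def __init__(self, tile=8):
        self.tile = tile
        # Maps a tile key (tx, ty, tz) to an int used as a field of tile**3 bits.
        self.tiles = {}

    def _locate(self, x, y, z):
        t = self.tile
        key = (x // t, y // t, z // t)
        offset = (x % t) + t * ((y % t) + t * (z % t))
        return key, offset

    def set(self, x, y, z):
        """Turn on the bit at (x, y, z), allocating its tile if needed."""
        key, offset = self._locate(x, y, z)
        self.tiles[key] = self.tiles.get(key, 0) | (1 << offset)

    def get(self, x, y, z):
        """Return True if the bit at (x, y, z) is set."""
        key, offset = self._locate(x, y, z)
        return bool((self.tiles.get(key, 0) >> offset) & 1)

    def __and__(self, other):
        """Boolean AND: only tiles allocated in both bitmaps are visited."""
        if self.tile != other.tile:
            raise ValueError("tile sizes must match")
        out = SparseBitmap3D(self.tile)
        for key in self.tiles.keys() & other.tiles.keys():
            bits = self.tiles[key] & other.tiles[key]
            if bits:
                out.tiles[key] = bits
        return out

    def count(self):
        """Number of set bits in the whole bitmap."""
        return sum(bin(bits).count("1") for bits in self.tiles.values())


a = SparseBitmap3D()
a.set(1, 2, 3)
a.set(100, 200, 5)   # far from the first bit, so a second tile is allocated
b = SparseBitmap3D()
b.set(1, 2, 3)
common = a & b       # Boolean operation touches shared tiles only
```

Because Boolean operations iterate over allocated tiles rather than over the full index space, their cost in this sketch scales with the amount of occupied space, which is the property the paragraph above attributes to the GB structure.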

For this project, 3D bitmaps were used to cluster the pixels in an image of interest, based on geospatial location
(x, y) and color (z). A separate clustering was performed for each color in the image. For example, figure 2
illustrates the results of clustering the shoreline pixels (darker brown color) in the sample image (left). All
pixels within a geospatial distance of 1 (x and y) and a color distance of 9 (using the Commission Internationale
d’Éclairage (CIE) L*a*b* color space) are included in the clusters, which are shown in green (right). In this case,
the resulting clusters contain only the shoreline pixels themselves. If the color distance z were increased to 10,
every pixel in this image would be contained in a single cluster, because every pixel in this image is immediately
surrounded by pixels that are within a color distance of 10 in CIE L*a*b* space.
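
The chaining behavior described above can be sketched as follows. This is a minimal illustration that substitutes a breadth-first flood fill for the single-pass GB algorithm, so it reproduces the grouping rule but not the efficiency of the original method. The function name `chain_clusters`, the image representation (rows of (L, a, b) tuples), and the choice of "less than or equal to" for the color threshold are assumptions.

```python
from collections import deque
from math import dist


def chain_clusters(image, seed_color, dx=1, dy=1, dz=9.0):
    """Group pixels by chaining outward from every pixel of seed_color.

    image: list of rows; each pixel is an (L, a, b) tuple.
    A neighbor joins the cluster when it lies within dx columns and dy rows
    of a pixel already in the cluster and its color distance to that pixel
    is at most dz (assumed inclusive threshold).
    Returns a list of clusters, each a set of (x, y) positions.
    """
    height, width = len(image), len(image[0])
    visited = set()
    clusters = []
    for y0 in range(height):
        for x0 in range(width):
            if image[y0][x0] != seed_color or (x0, y0) in visited:
                continue
            cluster = {(x0, y0)}
            visited.add((x0, y0))
            queue = deque([(x0, y0)])
            while queue:
                x, y = queue.popleft()
                for ny in range(max(0, y - dy), min(height, y + dy + 1)):
                    for nx in range(max(0, x - dx), min(width, x + dx + 1)):
                        if (nx, ny) in visited:
                            continue
                        if dist(image[y][x], image[ny][nx]) <= dz:
                            visited.add((nx, ny))
                            cluster.add((nx, ny))
                            queue.append((nx, ny))
            clusters.append(cluster)
    return clusters


# Hypothetical one-row image: seed-colored pixels, a near match, and a distant color.
SEED = (50.0, 0.0, 0.0)
row = [SEED, SEED, (45.0, 0.0, 0.0), (30.0, 0.0, 0.0), SEED]
tight = chain_clusters([row], SEED, dz=9.0)
loose = chain_clusters([row], SEED, dz=20.0)
```

With the smaller threshold the distant color blocks the chain and the seed pixels split into two clusters; with the larger threshold every pixel is reachable from its neighbor and the whole row merges, which mirrors the z = 9 versus z = 10 behavior described for figure 2.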


Figure  2.  Example  of  clustering  by  geospatial  location  and  color:  all  pixels  within  a  predetermined 
distance  in  geospatial  (x=1,  y=1)  and  color  (z=9)  space  of  the  shoreline  pixels  (brown  pixels  in  the  original 
image,  left)  are  clustered  together.  The  resulting  clusters  are  shown  in  green,  right.  The  zoomed-in 
section  shows  a  detail  of  the  pixels  being  clustered. 


Calculating  Cluster  Density 

After clustering all pixels in the image into bounded polygons for a given “seed color” s, a cluster density $D_p$
is calculated for each cluster polygon p:

$$D_p = \frac{\sum_c W_c N_c}{A_p}$$

where:

$W_c$ = weighting factor for color c, given by $W_c = 1 - E_c / M$

$E_c$ = Euclidean distance between colors c and s in the chosen color space; e.g., for CIE L*a*b*:
$E_c = \sqrt{(L_c - L_s)^2 + (a_c - a_s)^2 + (b_c - b_s)^2}$

$M$ = maximum distance between colors in the chosen color space
$N_c$ = number of pixels of color c in the cluster polygon
$A_p$ = area of cluster polygon p

The color of each pixel in the cluster will be within a color distance of z from all immediately surrounding pixels
in the cluster, starting with pixels of color s. In other words, the cluster will “chain” pixels together to form
the cluster, starting with each pixel of color s and subsequently including all other pixels within a geospatial
distance of x, y and a color distance of z. If z = 0, then $D_p = N_s / A_p$.
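
As a worked illustration of the density formula above (weighted pixel count divided by polygon area), the following sketch computes the density for one cluster. The function name, the dictionary input format, and the value 100.0 used for the maximum color distance M in the example are assumptions; the source does not state a numerical value for M.

```python
from math import dist


def cluster_density(pixel_counts, seed_color, area, max_distance):
    """Density of one cluster: sum over colors of W_c * N_c, divided by area.

    pixel_counts: dict mapping an (L, a, b) color c to its pixel count N_c.
    seed_color:   the (L, a, b) seed color s.
    area:         area A_p of the cluster polygon, in pixels.
    max_distance: maximum color distance M in the chosen color space.
    """
    weighted = 0.0
    for color, n_c in pixel_counts.items():
        e_c = dist(color, seed_color)        # Euclidean color distance E_c
        w_c = 1.0 - e_c / max_distance       # weighting factor W_c
        weighted += w_c * n_c
    return weighted / area


seed = (50.0, 0.0, 0.0)

# Hypothetical cluster: 80 seed-colored pixels plus 20 pixels at color distance 5.
mixed = cluster_density({seed: 80, (50.0, 3.0, 4.0): 20}, seed,
                        area=100.0, max_distance=100.0)

# With z = 0 only seed-colored pixels are clustered, so the density is N_s / A_p.
pure = cluster_density({seed: 80}, seed, area=100.0, max_distance=100.0)
```

In the first case the 20 off-color pixels each contribute a weight of 0.95 rather than 1, so the density is slightly below the raw pixel count divided by the area.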


Note the inverse relationship between clutter and “density” as it is used here: higher density tends to predict
lower clutter, since density describes how closely packed similar pixels are in the image.

Local  and  Global  Clutter  Metrics 


Local density ($D_s$) estimates how much an individual seed color (s) contributes to the overall clutter of the
image. $D_s$ is computed as the weighted average of the densities for all clusters centered on color s:

$$D_s = \frac{\sum_p D_p A_p}{A_s}$$

where:

$D_p$ = density of cluster p (described in the previous section)
$A_p$ = area of cluster polygon p
$A_s$ = sum of the areas of all clusters for color s

Global density ($D_i$), which estimates clutter for the entire image i, is computed as the weighted average of the
local densities for all colors in the image:

$$D_i = \frac{\sum_s D_s A_s}{A_i}$$

where:

$D_s$ = weighted average of the cluster densities for all clusters centered on color s (described above)
$A_i$ = sum of all $A_s$ values for image i

We will refer to $(1 - D_s)$ as the Local Clutter Metric (LCM) for color s, and $(1 - D_i)$ as the Global Clutter
Metric (GCM) for image i.
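
The two area-weighted averages and the resulting metrics can be sketched in a few lines. The function names, the input format (a dictionary from a color label to a list of (density, area) pairs), and the sample numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def local_density(clusters):
    """Area-weighted average density for one seed color.

    clusters: list of (density, area) pairs, one per cluster of that color.
    Returns (local density, total cluster area for the color).
    """
    total_area = sum(area for _, area in clusters)
    density = sum(d * area for d, area in clusters) / total_area
    return density, total_area


def clutter_metrics(clusters_by_color):
    """Return (LCM per color, GCM) from per-cluster densities and areas."""
    per_color = {s: local_density(c) for s, c in clusters_by_color.items()}
    image_area = sum(area for _, area in per_color.values())
    global_density = sum(d * area for d, area in per_color.values()) / image_area
    lcm = {s: 1.0 - d for s, (d, _) in per_color.items()}
    gcm = 1.0 - global_density
    return lcm, gcm


# Hypothetical image with two colors: large uniform water areas and sparse text.
lcm, gcm = clutter_metrics({
    "water": [(0.9, 100.0), (0.7, 100.0)],   # local density 0.8 over area 200
    "text": [(0.2, 50.0)],                   # local density 0.2 over area 50
})
```

Here the sparse text has a high LCM and the water a low one, while the GCM sits closer to the water value because the water clusters cover four times the area.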

Saliency 

We estimate the local saliency of a given color or feature as a weighted average of the color differences between
that color or feature and the immediately adjacent colors or features. For example, if one feature in the image
(e.g., a yellow lighthouse symbol on a nautical chart) is completely surrounded by another feature (e.g., solid
blue water), we would estimate the saliency of the lighthouse as the Euclidean distance between these two colors
(yellow and blue) in a perceptually representative color space. If this lighthouse symbol were placed on a
shoreline (brown), such that 40% of the lighthouse symbol was bordered by the blue water, 40% by tan land, and 20%
by the brown shoreline, we would estimate the saliency of the lighthouse as
0.4*(blue-yellow) + 0.4*(tan-yellow) + 0.2*(brown-yellow), where each parenthesized term denotes the color distance
between the two named colors. Global saliency is estimated as the weighted average of the local saliencies for all
colors (or features) in the image. Greater color distances result in greater saliency.
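
The lighthouse example can be written as a short function. This is a sketch under stated assumptions: the function name and input format are invented for illustration, and the L*a*b* coordinates assigned to yellow, blue, tan, and brown are placeholder values, not colors measured from any chart in this study.

```python
from math import dist


def local_saliency(feature_color, borders):
    """Boundary-weighted average color distance between a feature and its neighbors.

    feature_color: (L, a, b) color of the feature of interest.
    borders: list of (neighbor_color, weight) pairs, where weight is the share
             of the feature's boundary that touches that neighbor.
    """
    total_weight = sum(weight for _, weight in borders)
    weighted = sum(weight * dist(feature_color, color) for color, weight in borders)
    return weighted / total_weight


# Placeholder L*a*b* values (assumptions for illustration only).
YELLOW = (90.0, -5.0, 85.0)
BLUE = (60.0, -10.0, -40.0)
TAN = (80.0, 5.0, 25.0)
BROWN = (45.0, 15.0, 30.0)

# Lighthouse completely surrounded by water: saliency is one color distance.
on_water = local_saliency(YELLOW, [(BLUE, 1.0)])

# Lighthouse on the shoreline: 40% water, 40% land, 20% shoreline.
on_shore = local_saliency(YELLOW, [(BLUE, 0.4), (TAN, 0.4), (BROWN, 0.2)])
```

Dividing by the total weight means the border shares may be given as fractions, percentages, or raw boundary lengths without changing the result.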


Figure  3.  CIE  L*a*b*  color  space,  with  colors  from  a  sample  electronic 
display  plotted  using  the  ColorSpace  program  (Colantoni,  2005). 

The  choice  of  an  appropriate  color  space  is  central  to  this  theory.  Unfortunately,  no  single  color  space  has  been 
shown  to  perfectly  model  human  visual  perception.  For  this  paper,  we  chose  the  CIE  L*a*b*  color  space  (pictured 
in  figure  3),  but  we  are  continuing  to  search  for  improved  options. 


RESULTS 


Clutter  Metrics  for  Sample  Chart  Series 

Figure  4a  presents  a  sample  image  (a  nautical  chart)  comprised  of  9  primary  features  and  18  total  colors.  Figure  4b 
shows  the  LCM  for  each  color  component  and  each  feature  in  the  image.  The  LCM  does  not  account  for  saliency, 
which  might  explain  why  certain  features  (e.g.,  the  green  feature  off  the  northeast  corner  of  the  small  island)  are 
listed  with  higher  clutter  values  than  expected. 

Figure  5  presents  the  GCM  for  five  different  charts  in  this  nautical  chart  series.  As  more  features  are  added  to  the 
chart,  the  GCM  increases,  indicating  higher  levels  of  clutter. 

Figure 4. a) Sample nautical chart comprised of 9 primary features and 18 total colors. b) LCM for each
color and feature in the image (sorted from lowest to highest LCM). Higher LCM values indicate greater
contributions to image clutter, but do not account for saliency.


Figure  5.  GCM  for  five  charts  in  a  nautical  chart  series. 


Correlation  between  Subjective  Clutter  Ratings  and  Saliency,  LCM,  and  GCM 

As a first attempt to ascertain whether the proposed clutter metrics corresponded at all with perceived clutter
(prior to receiving approval to perform human subject trials), the first author rated the individual features of
several images to indicate how much each feature contributed clutter to the image, according to the following
rudimentary rating scale:

Low  clutter:  feature  contributes  very  little  clutter  to  image 
Medium  clutter:  feature  contributes  some  clutter  to  image 
High  clutter:  feature  contributes  substantial  clutter  to  image 

For 330 features rated from 28 images, the probability that a correlation between subjective ratings of local
clutter and each tested clutter metric (alone) was due to chance was p < 0.0001 (χ² = 314.3) for LCM, and
p < 0.0001 (χ² = 27.9) for salience. An ordinal logistic fit of the ratings with a combination of both metrics was
slightly more significant than with the LCM metric alone (χ² = 316.3). Figure 6 plots local salience vs. LCM, with
each data point colored according to its subjective clutter rating.

Figure 6. Comparison of subjective clutter ratings with local saliency vs. LCM.


Likewise, the first author rated each image for how cluttered it appeared overall:

Low clutter: very easy to read image
Medium clutter: can read some information on the image; other information is hard to discern
High clutter: difficult to make sense of the image

For 28 images, the probability that a correlation between the subjective ratings of global clutter and each tested
clutter metric (alone) was due to chance was p < 0.0001 (χ² = 25.4) for GCM, and p < 0.05 (χ² = 4.1) for salience.
An ordinal logistic fit of the subjective ratings with a combination of both metrics was more significant than with
either metric alone: p < 0.0001 (χ² = 38.0). Figure 7 plots global salience vs. GCM, with each data point colored
according to its subjective clutter rating.

Of  course,  with  only  a  single  subject,  these  results  are  very  preliminary.  A  follow-on  study  is  underway  to  collect 
subjective  ratings  from  a  larger  user  population  for  comparison  with  these  clutter  metrics. 


Figure 7. Comparison of subjective clutter ratings with global salience vs. GCM (plot title: “Global Clutter
Metrics for 28 Charts: Comparison with Preliminary Subjective Clutter Ratings”).


DISCUSSION 

The  theory  proposed  at  the  beginning  of  this  paper  appears  to  have  merit:  a  preliminary  subjective  evaluation  of  both 
local  and  global  clutter  seems  to  correlate  with  the  LCM  and  GCM  values,  especially  when  saliency  is  considered  in 
conjunction  with  these  metrics.  Four  global  clutter  examples  are  provided  in  figure  8,  illustrating  the  extremes  for 
GCM  and  salience. 


Figure 8a. Low global saliency and high GCM = moderate-to-high clutter.

Figure 8b. High global saliency and high GCM = high clutter.

Figure 8c. Low global saliency and low GCM = low clutter.

Figure 8d. High global saliency and low GCM = low-to-moderate clutter.


For example, low GCM (i.e., high color uniformity) and low saliency tend to result in low clutter, such as in
images with relatively large, uniform areas that do not contrast sharply with each other (e.g., figure 8c).
Conversely, high GCM and high saliency tend to result in high clutter (e.g., figure 8b). Low saliency and high GCM
tend to result in moderate-to-high clutter (e.g., figure 8a), in which the image is populated with smaller features
that do not contrast well and may be difficult to discern. Finally, high saliency and low GCM tend to result in
low-to-moderate clutter, in which relatively large features contrast sharply with each other (such as the land and
water areas, clearly delineated by black shoreline boundaries, in figure 8d).
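
The four qualitative cases above can be summarized as a simple lookup. This is only an illustrative sketch: the paper describes the relationship qualitatively and gives no numerical cutoffs, so the function name and both threshold parameters are hypothetical and must be supplied by the caller.

```python
def clutter_category(saliency, gcm, saliency_threshold, gcm_threshold):
    """Map a (saliency, GCM) pair to one of the four qualitative clutter levels.

    Both thresholds are hypothetical inputs; no cutoff values are implied
    by the source text.
    """
    high_saliency = saliency >= saliency_threshold
    high_gcm = gcm >= gcm_threshold
    if high_gcm:
        return "high" if high_saliency else "moderate-to-high"
    return "low-to-moderate" if high_saliency else "low"


# Example with arbitrary placeholder thresholds and values.
label = clutter_category(saliency=12.0, gcm=0.7,
                         saliency_threshold=30.0, gcm_threshold=0.5)
```

A hard four-way split is a simplification; the ordinal logistic fits reported in the Results section treat both metrics as continuous predictors.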

The  next  phase  of  this  project  includes  analyzing  over  150  images  of  various  types  (e.g.,  aeronautical  charts,  nautical 
charts,  topographic  maps,  city/road  maps,  weather  /  meteorological  charts,  subway  maps,  airport  terminal  maps,  etc.) 
to  correlate  subjective  evaluations  of  clutter  with  the  GCM,  LCM,  and  saliency  metrics  described  in  this  paper.  User 
performance  studies  are  also  planned,  to  determine  whether  these  metrics  can  predict  performance. 